On Text-based Mining with Active Learning and Background Knowledge Using SVM
Similar resources
Text Identification in Complex Background Using SVM
This paper presents a fast and robust algorithm to identify text in image or video frames with complex backgrounds and compression effects. The algorithm first extracts the candidate text line on the basis of edge analysis, baseline location and heuristic constraints. Support Vector Machine (SVM) is then used to identify text line from the candidates in edge-based distance map feature space. Ex...
Text Clustering Based on Background Knowledge
Text document clustering plays an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters. Standard partitional or agglomerative clustering methods efficiently compute results to this end. However, the bag of words representation used for these clustering methods is often unsatisfactory as it ...
Active Learning with SVM
With the increasing demand for multimedia information retrieval, such as image and video retrieval from the Web, there is a need to find ways to train a classifier when the training dataset combines a small number of labelled examples with a large number of unlabelled ones. Traditional supervised or unsupervised learning methods are not suited to solving such problems, particularly when the prob...
Mining Associations in Text in the Presence of Background Knowledge
This paper describes the FACT system for knowledge discovery from text. It discovers associations − patterns of co-occurrence − amongst keywords labeling the items in a collection of textual documents. In addition, FACT is able to use background knowledge about the keywords labeling the documents in its discovery process. FACT takes a query-centered view of knowledge discovery, in which a disco...
SVM-Based Spam Filter with Active and Online Learning
A realistic classification model for spam filtering should take into account not only that spam evolves over time, but also that labeling a large number of examples for initial training can be expensive in terms of both time and money. This paper addresses the problem of separating legitimate emails from unsolicited ones with an active and online learning algorithm, using a Support Vector Mac...
Journal
Journal title: Soft Computing
Year: 2006
ISSN: 1432-7643,1433-7479
DOI: 10.1007/s00500-006-0080-8